

Section: New Results

Content-Oriented Systems

Participants : Sara Alouf, Konstantin Avrachenkov, Nicaise Choungmo Fofack, Delia Ciullo, Alain Jean-Marie, Philippe Nain, Giovanni Neglia, Marina Sokol.

Performance evaluation of hierarchical TTL-based cache networks

N. Choungmo Fofack, P. Nain and G. Neglia, together with D. Towsley (Univ. of Massachusetts at Amherst, USA), have revisited and extended the work that appeared in [82]. They consider caches that manage the contents in their memory with an expiration-based eviction policy; such caches are called Time-To-Live (TTL)-based caches. TTL-based caches can be used to model caches running classical replacement policies such as Least Recently Used (LRU) and Random Replacement (RND). The main characteristic of these TTL-based cache models is that they (re)initialize the TTL of a content at both cache hit and cache miss. A paper currently under review analyzes the case of a network of caches in which requests for each content are routed along a polytree, and proposes a framework to evaluate the performance of such general TTL-based cache networks.
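The reset rule described above can be illustrated on a single cache. The following is a minimal sketch, not the framework of the paper under review: it assumes one content, Poisson requests of rate `rate`, and a constant TTL `ttl` (all names are hypothetical). Because the timer is reset at every request, a request is a hit exactly when the previous request occurred less than `ttl` ago, so under these assumptions the hit probability is 1 - exp(-rate * ttl).

```python
import math
import random


def simulate_ttl_reset_cache(rate, ttl, n_requests, seed=0):
    """Estimate the hit probability of one content in a TTL-based cache.

    Assumptions (illustrative only): Poisson requests with the given rate,
    a constant TTL, and a timer that is reset at every hit and every miss.
    """
    rng = random.Random(seed)
    now = 0.0
    expiry = -math.inf  # the content is not in the cache initially
    hits = 0
    for _ in range(n_requests):
        now += rng.expovariate(rate)  # exponential inter-request time
        if now < expiry:
            hits += 1
        # Hit or miss, the TTL is (re)initialized.
        expiry = now + ttl
    return hits / n_requests


def analytic_hit_probability(rate, ttl):
    """P(inter-request time < ttl) for exponential inter-request times."""
    return 1.0 - math.exp(-rate * ttl)
```

With `rate=2.0` and `ttl=0.5`, the simulated estimate should approach `analytic_hit_probability(2.0, 0.5)` as `n_requests` grows.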

Modeling modern DNS caches

Motivated by the recent behavior of Domain Name System (DNS) caches that do not respect the timeout set by authoritative DNS servers on resource records, N. Choungmo Fofack and S. Alouf propose in [44] a theoretical model based on renewal arguments to describe this modern behavior. The model of a cache taken in isolation is validated with real traces collected by Inria's IT service at Sophia-Antipolis at one of Inria's DNS caches. The model of a network of caches is validated by event-driven simulations. This study suggests that, when inter-request times have a concave cumulative distribution function, client caches (those caches that are fed directly by user requests) should keep each resource record for a constant duration (that may depend on its popularity). Core caches, however, should draw the timeout value of each record from a distribution that has as high a coefficient of variation as possible.
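A cache that picks its own timeouts can be simulated in a few lines. The sketch below is an illustration under stated assumptions, not the model of [44]: it assumes a single resource record, a timeout that is drawn only when the record is fetched on a miss and is not reset on hits, and two hypothetical callables, `draw_gap` and `draw_ttl`, that sample inter-request times and timeouts. For Poisson requests of rate λ and a constant timeout T, a renewal argument (one miss followed on average by λT hits per cycle) gives a hit probability of λT / (1 + λT), which the simulation can be checked against.

```python
import random


def simulate_dns_cache(draw_gap, draw_ttl, n_requests):
    """Estimate the hit probability of one record in an isolated cache.

    draw_gap: callable returning the next inter-request time (hypothetical).
    draw_ttl: callable returning the timeout chosen by the cache on a miss
              (hypothetical). The timer is NOT reset on hits.
    """
    now = 0.0
    expiry = 0.0  # record absent at time 0, so the first request is a miss
    hits = 0
    for _ in range(n_requests):
        now += draw_gap()
        if now < expiry:
            hits += 1  # served from the cache, timer keeps running
        else:
            expiry = now + draw_ttl()  # miss: fetch record, start new timer
    return hits / n_requests


def poisson_constant_ttl_hit_probability(rate, ttl):
    """Renewal-argument value for Poisson requests and a constant timeout."""
    return rate * ttl / (1.0 + rate * ttl)
```

Passing a different `draw_gap` (for example, a hyperexponential sampler) or a random `draw_ttl` with the same mean lets one explore how the timeout distribution affects the hit probability for a given request process.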

An approximate analysis of general and heterogeneous cache networks

Jointly with M. Dehghan, D. L. Goeckel and D. Towsley (Univ. of Massachusetts at Amherst, USA), N. Choungmo Fofack proposes a simple, accurate, and computationally efficient framework to assess the performance of networks of caches with arbitrary topology, requests described by renewal processes, and caches running Least Recently Used (LRU), First-In First-Out (FIFO), or Random Replacement (RND) policies. The framework is based on the characteristic time approximation of LRU, RND and FIFO caches, which makes it possible to model them as TTL-based caches. It relies on classical results of the theory of (renewal) point processes (e.g. approximation of general point processes by renewal processes, thinning of a renewal point process, aggregation/merging of independent renewal processes) and on theoretical results established in [82] and [44] on TTL-based caches (e.g. calculation of metrics of interest such as hit and occupancy probabilities, characterization of miss streams).
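The characteristic time approximation can be sketched for the simplest case. The code below is an illustration under assumptions that are narrower than the framework described above: a single LRU cache, independent Poisson requests with per-content rates `rates`, and a cache holding `capacity` contents (all names are hypothetical). The characteristic time T is taken as the root of sum_i (1 - exp(-rate_i * T)) = capacity, and each content is then treated as if it sat in a TTL-based cache with timer T reset at every request. FIFO and RND caches, as well as general renewal requests, need different per-content formulas that are not shown here.

```python
import math


def characteristic_time(rates, capacity, n_bisections=100):
    """Solve sum_i (1 - exp(-rate_i * T)) = capacity for T by bisection."""
    if any(r <= 0 for r in rates):
        raise ValueError("all request rates must be positive")
    if not 0 < capacity < len(rates):
        raise ValueError("capacity must lie strictly between 0 and len(rates)")

    def expected_occupancy(t):
        return sum(1.0 - math.exp(-r * t) for r in rates)

    low, high = 0.0, 1.0
    while expected_occupancy(high) < capacity:  # bracket the root
        high *= 2.0
    for _ in range(n_bisections):
        mid = 0.5 * (low + high)
        if expected_occupancy(mid) < capacity:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def lru_hit_probabilities(rates, capacity):
    """Per-content hit probabilities under the approximation (Poisson case)."""
    t_c = characteristic_time(rates, capacity)
    return [1.0 - math.exp(-r * t_c) for r in rates]
```

For instance, with Zipf-like rates `[1.0 / k for k in range(1, 101)]` and `capacity=10`, the returned probabilities sum to the cache capacity by construction, and more popular contents get higher hit probabilities.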

Data placement

Jointly with J.-C. Bermond (Inria project-team Coati), D. Mazauric (Univ. Aix-Marseille) and J. Yu (UFV Vancouver), A. Jean-Marie has pursued the study of combinatorial designs that solve the problem of optimally replicating data over unreliable servers, with the objective of minimizing the variance of the availability of documents. In a forthcoming revision of [81], they use results from Design Theory, in particular the existence of “large triple systems”, to solve multiple instances of the problem.

Semi-supervised learning with application to P2P systems

Semi-supervised learning methods constitute a category of machine learning methods that use labelled points together with unlabelled data to tune the classifier. The main idea of semi-supervised methods rests on the assumption that the classification function should change smoothly over a similarity graph, which represents relations among data points. This idea can be expressed using kernels on graphs, such as the graph Laplacian. Different semi-supervised learning methods use different kernels, which reflect how the underlying similarity graph influences the classification results. In [41], K. Avrachenkov, P. Gonçalves (Inria project-team Dante) and M. Sokol analyze a general family of semi-supervised methods, provide insights about the differences among the methods, and give recommendations for the choice of the kernel parameters and labelled points. In particular, it appears preferable to choose the kernel based on the properties of the labelled points. They illustrate their general theoretical conclusions with an analytically tractable characteristic example, a clustered preferential attachment model, and the classification of content in P2P networks.
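The smoothness idea can be made concrete with a small label-propagation sketch. It is an illustration, not the method analyzed in [41]: it assumes the classification function is obtained by iterating F ← α M F + (1 - α) Y, where Y encodes the labelled points and M = D^(-σ) W D^(σ-1) is a normalized similarity matrix (W the weight matrix, D the diagonal degree matrix). The parameters `alpha` and `sigma` and all function names are hypothetical; σ = 1/2 gives the symmetric normalization D^(-1/2) W D^(-1/2) and σ = 1 gives the random-walk matrix D^(-1) W.

```python
def propagate_labels(weights, labels, n_classes, alpha=0.9, sigma=0.5,
                     n_iter=200):
    """Iterate F <- alpha * M F + (1 - alpha) * Y on a similarity graph.

    weights: symmetric matrix (list of lists) of non-negative similarities.
    labels:  dict mapping a labelled node index to its class index.
    Returns one score per node and class; assumes no isolated node.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    if any(d <= 0 for d in degree):
        raise ValueError("every node needs at least one positive weight")

    # Y: one-hot rows for labelled nodes, zero rows for unlabelled ones.
    y = [[0.0] * n_classes for _ in range(n)]
    for node, cls in labels.items():
        y[node][cls] = 1.0

    # M = D^(-sigma) W D^(sigma - 1)
    m = [[degree[i] ** (-sigma) * weights[i][j] * degree[j] ** (sigma - 1.0)
          for j in range(n)] for i in range(n)]

    f = [row[:] for row in y]
    for _ in range(n_iter):
        f = [[alpha * sum(m[i][j] * f[j][k] for j in range(n))
              + (1.0 - alpha) * y[i][k]
              for k in range(n_classes)] for i in range(n)]
    return f


def classify(scores):
    """Assign each node to the class with the largest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def adjacency(n, edges):
    """Build an unweighted symmetric adjacency matrix from an edge list."""
    w = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        w[a][b] = w[b][a] = 1.0
    return w
```

As a usage example, take two triangles joined by one edge, `adjacency(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])`, with labels `{0: 0, 5: 1}`: each triangle is assigned the class of its labelled node.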